Apparatus and method for image coding and decoding

ABSTRACT

An image coding apparatus is provided. A selector receives a multiplexed transport stream that includes multimedia coding data. A demultiplexer separates a video stream from the multiplexed transport stream. A decoder reproduces the video stream as decoded video data. A coding generator receives multimedia information associated with the multimedia coding data and generates display control information. The display control information includes a mismatch flag which indicates whether a display mismatch condition exists between the video data and multimedia coding data. An output unit outputs the decoded video data, the multimedia coding data and the mismatch flag.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a divisional of U.S. application Ser. No. 09/872,147, filed Jun. 1, 2001, which claims priority from Japanese Application No. P2000-165298, filed Jun. 2, 2000, and Japanese Application No. P2001-001031, filed Jan. 9, 2001, the disclosures of which are hereby incorporated by reference herein.

BACKGROUND OF THE INVENTION

The present invention relates generally to an image coding apparatus and method, an image decoding apparatus and method, and a recording medium. More specifically, the present invention relates to an image coding apparatus and method, an image decoding apparatus and method, and a recording medium which are suitable for use in apparatus for re-encoding video streams and recording and reproducing the re-encoded video streams.

Digital television broadcasts such as European DVB (Digital Video Broadcast), American DTV (Digital Television) broadcast, and Japanese BS (Broadcast Satellite) digital broadcast use MPEG (Moving Picture Experts Group) 2 transport streams. A transport stream consists of continuous transport packets, each packet carrying video data or audio data, for example. The data length of one transport packet is 188 bytes.

Unlike analog television broadcasts, digital television broadcasts are capable of providing services to which multimedia coding data is added. In these services, data such as video data, audio data, character graphics data, and still picture data, for example, are associated with each other for transmission by the multimedia coding data. For the multimedia coding data, a coding method based on XML (Extensible Markup Language) is used in the Japanese BS digital broadcast, for example. The details of this method are disclosed in ARIB STD-B24 Data Coding And Transmission Specification for Digital Broadcasting, for example.

Data such as video data, audio data, character graphics data, and still picture data are each packetized into a transport packet for transmission.

FIGS. 1A and 1B show an example of synthesizing data to be transferred between the sending and receiving sides and a multimedia screen. As shown in FIG. 1A, the sending side sends to the receiving side video data, character graphics data for displaying buttons A through C, text data for displaying “XYZABC . . . ,” and multimedia coding data for relating these data to each other. The sending side generally denotes a television broadcast station, for example. Herein, however, the term also covers a recording apparatus (the recording side) which receives and records data transmitted from broadcast stations, so that the data shown in the example of FIG. 1A includes the data which is output from this recording apparatus.

The multimedia coding data includes data which can synthesize on the receiving side video data, character graphics data, and text data and display the synthesized data. To be more specific, the multimedia coding data includes the data associated with the display positions of the video, character graphics, and text which are displayed by the size-associated data such as the multimedia plane (the display area of images on the television receiver, for example) size (plane_height and plane_width) and video display size (video_height and video_width), video data, character graphics data, and text data, as shown in FIG. 1B.

On the basis of the multimedia coding data, the receiving side processes the video data, the character graphics data, and the text data to display a resultant image, as shown in FIG. 1B.

Through the screen on which the above-mentioned image is displayed, the user can receive services such as displaying desired information in the video section by clicking button A corresponding to that information and obtaining, from the text data displayed in the bottom of the screen, the information associated with the matter displayed in the video section, for example.

If a television program carried by a transport stream transmitted from a digital television broadcast is recorded without change to a recording medium on the receiving side, the program can be recorded without its picture and audio qualities being deteriorated at all. However, in order to record as long a television program as possible to a recording medium having a limited recording capacity by presupposing a certain degree of picture quality deterioration, the received video stream must be decoded and then encoded again to lower the bit rate of the transport stream.

For example, the re-encoding of the video stream of a television program accompanied by multimedia coding data to lower its bit rate for recording may be implemented by sub-sampling the image to change the picture frame. However, this approach presents a problem of causing a mismatch in the relationship between the video stream resulting from re-encoding and the multimedia coding data. The following describes an example of this mismatch with reference to FIGS. 2A and 2B.

In the example shown in FIG. 2A, the sending side (the recording side) converts the original video picture frame to a smaller picture frame at the time of re-encoding. Therefore, as shown in FIG. 2B, on the receiving side (the reproducing side), changes occur in the video display size and position, resulting in a display screen which is different from the display screen intended by the sending side (the display screen to be displayed on the basis of the data before being re-encoded).

SUMMARY OF THE INVENTION

According to an aspect of the invention, an image coding apparatus is provided. A selector receives a multiplexed transport stream that includes multimedia coding data. A demultiplexer separates a video stream from the multiplexed transport stream. A decoder reproduces the separated video stream as decoded video data. A coding generator receives multimedia information associated with the multimedia coding data and generates display control information. The display control information includes a mismatch flag which indicates whether a display mismatch condition exists between the video data and multimedia coding data. An output unit outputs the decoded video data, the multimedia coding data and the mismatch flag.

In accordance with this aspect of the invention, an encoder may be coupled to the decoder and may reproduce the video stream based on the multimedia information associated with the multimedia coding data and the video data. The output unit may comprise a writing unit that records the decoded video data, the multimedia coding data and the mismatch flag onto a recording medium. A coding controller may be coupled between the selector and the coding generator and may generate the multimedia information associated with the multimedia coding data. A data analyzer may be coupled between the selector and the coding controller and may detect at least a bit rate associated with the video stream. The display control information may include a re-encode flag which indicates whether the video data is re-encoded. The display control information may include a frame size change flag which indicates whether a size of a picture frame associated with the video data has been changed.

The foregoing aspects, features and advantages of the present invention will be further appreciated when considered with reference to the following detailed description and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects of the invention will be seen by reference to the description, taken in connection with the accompanying drawings, in which:

FIGS. 1A and 1B are schematic diagrams illustrating a display screen to be shown on the basis of multimedia coding information;

FIGS. 2A and 2B are schematic diagrams illustrating a mismatch which takes place when a video stream is re-encoded;

FIG. 3 is a block diagram illustrating a recording apparatus practiced as one embodiment of the present invention;

FIGS. 4A and 4B illustrate the operation of a multiplexer shown in FIG. 3;

FIGS. 5A, 5B and 5C illustrate the processing by an arrival timestamp adding block;

FIG. 6 illustrates multimedia display sub-information;

FIG. 7 illustrates an example of ProgramInfo( ) syntax;

FIG. 8 illustrates an example of StreamCodingInfo( ) syntax;

FIG. 9 illustrates the meaning of stream_coding_type;

FIG. 10 illustrates the meaning of video_format;

FIG. 11 illustrates the meaning of frame_rate;

FIG. 12 illustrates the meaning of display_aspect_ratio;

FIG. 13 is a flowchart describing the processing of coding AV stream and multimedia display sub-information;

FIG. 14 is a flowchart describing the coding processing to be executed for restricting the re-encoding of the video of a multiplexed stream including multimedia coding data;

FIG. 15 illustrates an example of an input transport stream;

FIG. 16 illustrates an example of a transport stream after the re-encoding of the video stream shown in FIG. 15;

FIG. 17 is a flowchart describing a recording rate control process by a recording apparatus shown in FIG. 3;

FIG. 18 is a flowchart describing another recording rate control process by the recording apparatus shown in FIG. 3;

FIG. 19 illustrates another example of a transport stream resulting from the re-encoding of the video stream;

FIG. 20 illustrates another example of the input transport stream;

FIG. 21 is a block diagram illustrating a configuration of a reproducing apparatus practiced as one embodiment of the present invention;

FIGS. 22A and 22B illustrate a display screen to be shown when multimedia display sub-information is added;

FIG. 23 is a block diagram illustrating another configuration of the recording apparatus practiced as one embodiment of the present invention;

FIG. 24 is a flowchart describing the processing of reproducing an AV stream which uses multimedia display sub-information;

FIG. 25 is a block diagram illustrating another configuration of the reproducing apparatus practiced as one embodiment of the present invention; and

FIG. 26 illustrates recording media.

DETAILED DESCRIPTION

This invention will be described in further detail by way of example with reference to the accompanying drawings. Now, referring to FIG. 3, there is shown a block diagram illustrating an exemplary configuration of a recording apparatus 1 practiced as one embodiment of the invention. A transport stream received at an antenna, not shown, is input in a selector 10. A program number (a channel number) specified by the user is also input from a terminal 11 to the selector 10. Referring to the received program number, the selector 10 extracts the specified program from the received transport stream and outputs a partial transport stream. The partial transport stream is input in a demultiplexer 12 and an analyzing block 13.

The partial transport stream input in the demultiplexer 12 is separated into a video stream and other streams (audio, still picture, character graphics, and multimedia coding data for example). The video stream thus obtained is output to a decoder 14. The other streams are output to a multiplexer 16. In addition to the transport packets other than video, the demultiplexer 12 outputs the output timing information in the input transport stream of these transport packets to the multiplexer 16.
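The separation performed by the demultiplexer 12 can be sketched as follows. This is only an illustrative Python sketch, not the apparatus itself. The function names, the use of the packet slot index as a stand-in for the output timing information, and the dummy-packet helper are assumptions made for illustration. The PID extraction follows the standard MPEG-2 transport packet header layout, in which a 13-bit PID occupies the low five bits of the second byte and all of the third byte.

```python
TS_PACKET_SIZE = 188  # bytes per transport packet, as stated in the background
SYNC_BYTE = 0x47      # standard MPEG-2 transport packet sync byte


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID carried in the transport packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demultiplex(packets, video_pid):
    """Split packets into video packets and (slot, packet) pairs for the rest.

    The slot index stands in for the output timing information passed to
    the multiplexer (an assumption of this sketch).
    """
    video, others = [], []
    for slot, pkt in enumerate(packets):
        if packet_pid(pkt) == video_pid:
            video.append(pkt)
        else:
            others.append((slot, pkt))
    return video, others


def make_packet(pid: int) -> bytes:
    """Build a dummy packet with the given PID (hypothetical helper)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))
```

Keeping the slot index with each non-video packet is what later allows those packets to be placed back at their original timing.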

The decoder 14 applies a predetermined decoding scheme, for example, MPEG2 to the input video stream and outputs the decoded video data to an encoder 15. Also, the decoder 14 outputs the stream information about the video stream obtained at decoding to a coding controller 18.

On the other hand, the analyzing block 13 analyzes the input transport stream to obtain the stream information about the non-video streams, for example, a bit rate, and outputs it to the coding controller 18. The stream information about the non-video streams output from the analyzing block 13, the video stream information output from decoder 14, and a stream recording bit rate output from a terminal 19 are input in the coding controller 18. From these data, the coding controller 18 sets the video data coding conditions (coding control information) to be executed by the encoder 15 and outputs these coding conditions to the encoder 15 and a coding block 20.

The coding controller 18 uses, as a bit rate to be allocated to the video data encoding, a value obtained by subtracting a total value (the data input from the analyzing block 13) of the bit rates of the non-video streams from a stream recording bit rate (the data input, via the terminal 19, from a controller, not shown, for controlling the operation of the recording apparatus 1, for example). The coding controller 18 sets coding control information such as bit rate and picture frame such that an optimum picture quality can be achieved with the bit rate thus obtained and outputs this coding control information to the encoder 15 and the coding block 20. The details of the coding control information will be described later with reference to FIGS. 15 through 20.

When a stream is recorded to a recording medium with a fixed rate, this stream recording bit rate becomes the fixed rate; if a stream is recorded with a variable bit rate, this stream recording bit rate is a mean bit rate per predetermined time. However, the maximum value of the variable bit rate in this case needs to be lower than the maximum recording bit rate ensured by the recording medium concerned.

The encoder 15 encodes (on the basis of MPEG2, for example) the video data output from the decoder 14 on the basis of the coding control information output from the coding controller 18 and outputs the resultant video data to the multiplexer 16. The video stream from the encoder 15, the transport stream packets other than video from the demultiplexer 12, and the information about the occurrence timing of the transport stream packets other than video are input in the multiplexer 16. On the basis of the input occurrence timing information, the multiplexer 16 multiplexes the video stream with the transport stream packets, other than video, and outputs the result to the arrival timestamp adding block 17 as a transport stream.

FIGS. 4A and 4B schematically illustrate the above-mentioned processing to be executed by the multiplexer 16. FIG. 4A shows the timing of the input transport stream packets. In these figures, the cross-hatched portions indicate the video packets while the white portions indicate the stream packets other than video. As shown in FIG. 4A, the input transport stream packets are continuous; however, the data volume of the video data is reduced by the re-encoding of video data by the encoder 15. Consequently, the number of video packets is reduced.

As shown in FIG. 4B, the multiplexer 16 does not change the timing of the stream packets other than video but causes only the timing of the video packets to be different from the original state (shown in FIG. 4A).
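The behavior shown in FIGS. 4A and 4B can be sketched as follows. This is a simplified Python illustration under stated assumptions: timing is modeled as integer packet slots, the re-encoded video packets are placed in the free slots in order, and unused slots are left empty. None of these details is specified in the description; the function name is hypothetical.

```python
def remultiplex(total_slots, others, video_packets):
    """Keep non-video packets at their original slots and fill free slots with video.

    others: list of (slot, packet) pairs from the input transport stream.
    video_packets: re-encoded video packets, fewer than in the input stream.
    """
    out = [None] * total_slots
    for slot, pkt in others:
        out[slot] = pkt  # timing of non-video packets is unchanged

    video_iter = iter(video_packets)
    for slot in range(total_slots):
        if out[slot] is None:
            pkt = next(video_iter, None)
            if pkt is None:
                break  # no more video packets; remaining slots stay empty
            out[slot] = pkt

    if next(video_iter, None) is not None:
        raise ValueError("not enough free slots for the video packets")
    return out
```

For example, with six slots, non-video packets at slots 1 and 4, and two re-encoded video packets, the video packets land in slots 0 and 2 while slots 3 and 5 remain empty.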

As shown in FIGS. 5A, 5B and 5C, the arrival timestamp adding block 17 adds a header (TP_extra_header) including an arrival timestamp to each of the packets (FIG. 5A) of the input transport stream to generate a source packet (FIG. 5B), arranges the generated source packets continuously (FIG. 5C), and outputs them to a writing block 21. The arrival timestamp is information indicative of the timing with which the transport stream packets occur in a transport stream. The writing block 21 takes the input source packet stream consisting of continuous source packets and records the file to a recording medium 22. It should be noted that the recording medium 22 may be any type of recording medium.
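The generation of source packets can be sketched as follows. This Python sketch is illustrative only. The description does not give the size or bit layout of TP_extra_header, so the 4-byte header holding a 32-bit big-endian arrival timestamp is an assumption, as are the function names.

```python
import struct

TS_PACKET_SIZE = 188   # bytes per transport packet
EXTRA_HEADER_SIZE = 4  # assumed header size; not specified in the description


def to_source_packet(ts_packet: bytes, arrival_timestamp: int) -> bytes:
    """Prepend an assumed 4-byte arrival timestamp header to one transport packet."""
    if len(ts_packet) != TS_PACKET_SIZE:
        raise ValueError("transport packet must be 188 bytes")
    header = struct.pack(">I", arrival_timestamp & 0xFFFFFFFF)
    return header + ts_packet


def to_source_packet_stream(timed_packets) -> bytes:
    """Arrange source packets continuously, as in FIG. 5C.

    timed_packets: iterable of (arrival_timestamp, ts_packet) pairs.
    """
    return b"".join(to_source_packet(pkt, ts) for ts, pkt in timed_packets)
```

Under these assumptions each source packet is 192 bytes long, so the n-th source packet starts at byte offset 192 × n in the recorded file.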

The information output from the coding block 20 is also input in the writing block 21. On the basis of the video coding information from the coding controller 18, the coding block 20 generates multimedia display sub-information and outputs the same to the writing block 21. The multimedia display sub-information to be output to the writing block 21 is information for keeping the video display position and display size on the multimedia plane unchanged from those of the image intended by the sending side (the image which would be displayed without re-encoding) even if the picture frame size has been changed by transcoding (decoding by the decoder 14 and then encoding by the encoder 15) a video stream. This information also is used at the time of reproduction in combination with multimedia coding data.

The following describes the multimedia display sub-information more specifically. As shown in FIG. 6, the multimedia display sub-information consists of three flags, namely a mismatch flag (mismatch_MMinfo_flag), a re-encoded flag (Re_encoded_flag), and a frame size change flag (changed_frame_size_flag); two size values, namely an original horizontal size (original_horizontal_size) and an original vertical size (original_vertical_size); and an original screen aspect ratio (original_display_aspect_ratio).

The mismatch flag indicates whether there exists a mismatch in the relationship between video and multimedia coding data. The re-encoded flag indicates whether the video has been re-encoded at the time of recording. The frame size change flag indicates whether the picture frame of video has been changed by re-encoding, for example. The original horizontal size indicates the horizontal size of a picture frame before re-encoding. The original vertical size indicates the vertical size of a picture frame before re-encoding. The original screen aspect ratio indicates the aspect ratio of a frame screen before re-encoding.

It should be noted that the above-mentioned multimedia display sub-information is illustrative only. Therefore, information other than that shown in FIG. 6 may be included in, or part of the information shown in FIG. 6 may be excluded from, the multimedia display sub-information.
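The fields of FIG. 6 can be pictured as a simple record, as in the following Python sketch. The field names are taken from the description, but the field types, the helper function, and the rule used to set the mismatch flag (frame size changed by re-encoding while multimedia coding data is present) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MultimediaDisplaySubInfo:
    mismatch_MMinfo_flag: bool           # mismatch between video and multimedia coding data
    Re_encoded_flag: bool                # video was re-encoded at recording time
    changed_frame_size_flag: bool        # picture frame was changed by re-encoding
    original_horizontal_size: int        # horizontal size before re-encoding
    original_vertical_size: int          # vertical size before re-encoding
    original_display_aspect_ratio: str   # aspect ratio before re-encoding


def build_sub_info(original_size, recorded_size, re_encoded,
                   has_multimedia_data, original_aspect):
    """Fill in the sub-information from the frame sizes before and after recording.

    The mismatch rule below is an assumption of this sketch.
    """
    changed = re_encoded and original_size != recorded_size
    return MultimediaDisplaySubInfo(
        mismatch_MMinfo_flag=changed and has_multimedia_data,
        Re_encoded_flag=re_encoded,
        changed_frame_size_flag=changed,
        original_horizontal_size=original_size[0],
        original_vertical_size=original_size[1],
        original_display_aspect_ratio=original_aspect,
    )
```

A reproducing side could then restore the intended layout from the original sizes whenever the mismatch flag is set.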

The following describes another example of the multimedia display sub-information. In the following example, the multimedia display sub-information is stored in a ProgramInfo( ) syntax shown in FIG. 7. The following describes the fields associated with the present invention in the ProgramInfo( ) syntax.

“length” indicates the number of bytes between the byte just after the length field and the last byte of ProgramInfo( ) inclusive.

“num_of_program_sequences” indicates the number of program sequences in the AV stream file. A source packet sequence with which the program contents specified by this format in the AV stream file are constant is referred to as a program sequence.

“SPN_program_sequences_start” indicates an address at which the program sequence starts in the AV stream file. “SPN_program_sequences_start” is expressed in units of source packet number, counted from the initial value 0 starting with the first source packet of the AV stream file.

“program_map_PID” is the value of the PID of a transport packet having the PMT (Program Map Table) applicable to that program sequence.

“num_of_streams_in_ps” indicates the number of elementary streams defined in that program sequence.

“stream_PID” indicates the value of the PID for the elementary stream defined in the PMT which is referenced by the program map PID of that program sequence.

“StreamCodingInfo( )” indicates the information about the elementary stream indicated by the above-mentioned stream PID.

FIG. 8 shows the syntax of StreamCodingInfo( ). “length” indicates the number of bytes between the byte just after this length field and the last byte of StreamCodingInfo( ) inclusive.

“stream_coding_type” indicates the coding type of the elementary stream indicated by the stream PID for this StreamCodingInfo( ). The meanings of the individual types are shown in FIG. 9.

If the value of stream_coding_type is 0x02, it indicates that the elementary stream indicated by the stream PID is a video stream.

If the value of stream_coding_type is 0x0A, 0x0B, or 0x0D, it indicates that the elementary stream indicated by the stream PID is multimedia coding data.

If the value of stream_coding_type is 0x06, it indicates that the elementary stream indicated by the stream PID is subtitles or teletext.

“video_format” indicates the video format of a video stream indicated by the stream PID for this StreamCodingInfo( ). The meanings of the individual video formats are shown in FIG. 10.

In FIG. 10, 480i indicates video display of NTSC standard TV (interlace frame of 720 pixels×480 lines). 576i indicates video display of PAL standard TV (interlace frame of 720 pixels×576 lines). 480p indicates video display of progressive frame of 720 pixels×480 lines. 1080i indicates video display of interlace frame of 1920 pixels×1080 lines. 720p indicates video display of progressive frame of 1280 pixels×720 lines.
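The video formats listed above can be held in a small lookup table, as in the following Python sketch. The numeric code values of video_format are given in FIG. 10 and are not reproduced here, so the table is keyed by format name. The helper that derives scale factors between two formats is a hypothetical illustration.

```python
# (width in pixels, height in lines, scan type) for each video format name.
VIDEO_FORMATS = {
    "480i": (720, 480, "interlace"),
    "576i": (720, 576, "interlace"),
    "480p": (720, 480, "progressive"),
    "1080i": (1920, 1080, "interlace"),
    "720p": (1280, 720, "progressive"),
}


def scale_factors(recorded_format: str, original_format: str):
    """Return (horizontal, vertical) factors that map the recorded frame
    size back to the original frame size."""
    rec_w, rec_h, _ = VIDEO_FORMATS[recorded_format]
    org_w, org_h, _ = VIDEO_FORMATS[original_format]
    return org_w / rec_w, org_h / rec_h
```

For instance, a stream recorded as 480i from a 1080i original would need a vertical factor of 1080/480 = 2.25 to be restored to its original frame height.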

“frame_rate” indicates the frame rate of a video stream indicated by the stream PID for this StreamCodingInfo( ). The meanings of the individual frame rates are shown in FIG. 11.

“display_aspect_ratio” indicates the display aspect ratio of a video stream indicated by the stream PID for this StreamCodingInfo( ). The meanings of the individual display aspect ratios are shown in FIG. 12.

“original_video_format_flag” indicates whether there exists original video format and original display aspect ratio in this StreamCodingInfo( ).

“original_video_format” indicates a video format before a video stream indicated by the stream PID for this StreamCodingInfo( ) is coded. The meanings of the individual original video formats are the same as shown in FIG. 10.

“original_display_aspect_ratio” is the display aspect ratio before a video stream indicated by the stream PID for this StreamCodingInfo( ) is coded. The meanings of the individual aspect ratios are the same as shown in FIG. 12.

It is assumed that, in transcoding a transport stream with a multimedia data stream (BML stream or subtitles) multiplexed along with a video stream, the re-encoding of the video stream changes its video format (for example, from 1080 i to 480 i), while the multimedia data stream retains its original stream contents. In this case, a mismatch in information may occur between a new video stream and the multimedia data stream. For example, although the parameters associated with the display of the multimedia data stream are determined on the supposition of the video format of the original video stream, the video format may be changed by the re-encoding of the video stream.

The video format of the re-encoded video stream is indicated by the video format and the display aspect ratio. The video format of the original video stream is indicated by the original video format and the original display aspect ratio.

If a mismatch exists between the values of the video format and the original video format and/or between the display aspect ratio and the original display aspect ratio, it indicates that a video format change has been caused by the video re-encoding at the time of recording.

If ProgramInfo( ) includes a stream PID whose stream coding type indicates multimedia coding data or subtitles, it indicates that the multimedia data is multiplexed in the AV stream file (a transport stream).

If ProgramInfo( ) indicates that a video format change has been caused by the re-encoding of video at the time of recording and multimedia data is multiplexed in the AV stream file, then it is determined that a mismatch exists in display between the video stream (re-encoded) and the multimedia data (the original multimedia data) in the AV stream file.

In such a case, the information about the original video stream, namely the original video format and the original display aspect ratio, becomes effective. The reproducing apparatus generates a display screen from the above-mentioned new video stream and multimedia data stream as follows.

The video stream is up-sampled to a video format indicated by the original video format and the original display aspect ratio.

The up-sampled image and the multimedia data stream are synthesized to form a correct display screen.
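The mismatch determination described above can be sketched as follows. This Python sketch is illustrative and is not the actual ProgramInfo( ) parser: each stream entry is modeled as a plain dictionary, and that representation and the function name are assumptions. The stream_coding_type values are those given in the description.

```python
VIDEO_TYPE = 0x02
MULTIMEDIA_TYPES = {0x0A, 0x0B, 0x0D}  # multimedia coding data
SUBTITLE_TYPE = 0x06                   # subtitles or teletext


def has_display_mismatch(streams) -> bool:
    """Return True if the video format was changed by re-encoding and
    multimedia data is multiplexed in the same AV stream file.

    streams: list of dicts modeling StreamCodingInfo( ) entries
    (a hypothetical representation).
    """
    format_changed = any(
        s["stream_coding_type"] == VIDEO_TYPE
        and bool(s.get("original_video_format_flag"))
        and (
            s["video_format"] != s["original_video_format"]
            or s["display_aspect_ratio"] != s["original_display_aspect_ratio"]
        )
        for s in streams
    )
    multimedia_present = any(
        s["stream_coding_type"] in MULTIMEDIA_TYPES
        or s["stream_coding_type"] == SUBTITLE_TYPE
        for s in streams
    )
    return format_changed and multimedia_present
```

When the function returns True, a reproducing apparatus following the description would up-sample the video to the original video format before synthesizing it with the multimedia data.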

The multimedia display sub-information generated by the coding block 20 is recorded by the writing block 21 to the recording medium 22, but is stored as a file which is different from the source packet stream file output from the arrival timestamp adding block 17. When the multimedia display sub-information is recorded as such a separate file, the coding block 20 outputs the multimedia display sub-information already formatted as a file.

FIG. 13 is a flowchart describing the processing of coding an AV stream and multimedia display sub-information.

In step 50, a multiplexed stream including multimedia coding data is input in the recording apparatus 1.

In step 51, the demultiplexer 12 separates the video stream from the multiplexed stream.

In step 52, the encoder 15 re-encodes the video stream decoded by the decoder 14.

In step 53, the multiplexer 16 multiplexes the above-mentioned video stream and multimedia coding data to generate a multiplexed stream.

In step 54, the coding block 20 generates multimedia display sub-information.

In the above description, the coding controller 18 generates the coding control information including bit rate and picture frame on the basis of the input data. Alternatively, the coding controller 18 may generate the following coding control information. Namely, if the analyzing block 13 finds that the input transport stream includes multimedia coding data, the coding controller 18 may generate coding control information instructing the encoder 15 to execute the re-encoding with a picture frame of the same size as that of the picture frame of the original video (the picture frame before re-encoding), and output the generated coding control information to the encoder 15.

When the above-mentioned method is used, the encoder 15 re-encodes the video data supplied from the decoder 14 with the same value as that of the picture frame of the original video stream on the basis of the input coding control information. If such coding control information is generated and the re-encoding is executed on the basis of the coding control information, no picture frame change is caused by the re-encoding, thereby preventing a mismatch from occurring in the relationship between the video stream obtained by re-encoding and the multimedia coding data.

Still alternatively, the coding controller 18 may generate the following coding control information. Namely, if the analyzing block 13 finds that the input transport stream includes multimedia coding data, the coding controller 18 may generate coding control information instructing the encoder 15 to execute the re-encoding with the same video format (shown in FIG. 10) and screen aspect ratio (shown in FIG. 12) as those of the original video, and output the coding control information to the encoder 15.

When the above-mentioned method is used, the encoder 15 re-encodes the video supplied from the decoder 14 under the same conditions as the video format (shown in FIG. 10) and screen aspect ratio (shown in FIG. 12) of the original video on the basis of the input coding control information. If such coding control information is generated and the re-encoding is executed on the basis of the coding control information, no video format and no screen aspect ratio change is caused by the re-encoding, thereby preventing a mismatch from occurring in the relationship between the video stream obtained by re-encoding and the multimedia coding data.

FIG. 14 is a flowchart describing the coding for restricting the re-encoding of the video of a multiplexed stream including multimedia coding data.

In step 70, a multiplexed stream is input in the recording apparatus 1.

In step 71, the demultiplexer 12 separates the video stream from the multiplexed stream.

In step 72, the analyzing block 13 checks whether multimedia coding data is included in the multiplexed stream. If the multimedia coding data is included, the analyzing block 13 sends the coding control information to the encoder 15, instructing it to re-encode the video stream without changing the display format. On the basis of the supplied control information, the encoder 15 re-encodes the video stream.

In step 73, the multiplexer 16 generates a multiplexed stream including the above-mentioned video stream.
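The restriction applied in the steps above can be sketched as follows. This Python sketch is illustrative only. The function name, the dictionary used for the coding control information, and the idea of passing in a candidate reduced picture frame are assumptions. The description states only that the display format is left unchanged when multimedia coding data is present.

```python
def coding_control_info(original_frame, reduced_frame, bit_rate,
                        has_multimedia_coding_data):
    """Return coding control information for the encoder.

    When multimedia coding data is present, the original picture frame is
    kept so that re-encoding cannot introduce a display mismatch; otherwise
    the reduced picture frame may be used to save bit rate.
    """
    if has_multimedia_coding_data:
        frame = original_frame
    else:
        frame = reduced_frame
    return {"bit_rate": bit_rate, "picture_frame": frame}
```

With this rule the bit rate can still be lowered by re-encoding, but the picture frame seen by the multimedia coding data stays the same.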

With reference to FIGS. 15 through 20, the following describes one example of control to be executed on the basis of the coding control information.

It is assumed here that a transport stream to be input to the selector 10 has a constant bit rate R_(I) as shown in FIG. 15, for example. The video stream and the non-video streams are coded by variable bit rates. In the example shown in FIG. 15, in unit time (for example, GOP) A, the bit rate of the video stream is R_(VA) and the bit rate of non-video streams is R_(OA). In unit time B, the bit rate of the video stream is R_(VB) and the bit rate of non-video streams is R_(OB). In unit time C, the bit rate of the video stream is R_(VC) and the bit rate of non-video streams is R_(OC).

If the transport stream as shown in FIG. 15 is re-encoded to output the transport stream having fixed bit rate S (S<R_(I)) as shown in FIG. 16 from the multiplexer 16, the coding controller 18 executes the processing described by the flowchart shown in FIG. 17.

First, in step S1, the coding controller 18 sets the bit rate S (recording rate) of the transport stream to be output from the multiplexer 16 on the basis of a control signal input from a controller, not shown, via the terminal 19. Next, in step S2, the coding controller 18 determines non-video streams to be recorded and computes a maximum total value D of the bit rates of the determined streams.

The maximum value D is determined from the stream specification of the input transport stream. For example, if two audio streams are to be recorded in addition to the video stream, the maximum value D is 384×2 Kbps since the maximum value of the bit rate of one audio stream is 384 Kbps according to the Japanese digital BS broadcast stream specification.

In step S3, the coding controller 18 uses value C obtained by subtracting the maximum value D computed in step S2 from the recording bit rate set in step S1 (C=S−D), as a bit rate to be allocated to the re-encoding of the video data. In step S4, the coding controller 18 analyzes the coding information such as the video stream bit rate and picture frame from the video stream information output from the decoder 14.

In step S5, the coding controller 18 determines, on the basis of the value C computed in step S3 and the video stream coding information analyzed in step S4, a video coding parameter (video coding control information) such that an optimum picture quality is achieved.
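The arithmetic of steps S2 and S3 can be sketched as follows. This Python sketch is illustrative. The function name and the example recording rate of 6000 Kbps are assumptions, while the 384 Kbps per-stream audio maximum is the value cited in the description.

```python
MAX_AUDIO_KBPS = 384  # maximum bit rate of one audio stream cited in the text


def video_allocation_kbps(recording_rate_kbps, non_video_max_kbps):
    """Return (D, C), where D is the total of the maximum non-video bit
    rates and C = S - D is the bit rate allocated to video re-encoding."""
    d = sum(non_video_max_kbps)
    c = recording_rate_kbps - d
    if c <= 0:
        raise ValueError("recording rate leaves no bit rate for video")
    return d, c


# Example: hypothetical recording rate S = 6000 Kbps with two audio streams.
d, c = video_allocation_kbps(6000, [MAX_AUDIO_KBPS, MAX_AUDIO_KBPS])
```

With two audio streams, D is 384×2 = 768 Kbps, so under the assumed recording rate the video is allocated C = 6000 − 768 = 5232 Kbps.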

For example, in the example shown in FIG. 16, value S is ½ of value R_(I). In the present example, the bit rate of streams other than video is the maximum value D, which is used without change as the bit rate of non-video streams in a multiplexed stream after re-encoding.

Then, video coding parameters are determined such that an optimum picture quality can be achieved within the range of (S−D). If the picture frame is controlled, the horizontal direction of a picture frame of 720×480 pixels, for example, is subsampled by ½ into 360×480 pixels. The determined coding parameters (bit rate and picture frame) are supplied to the encoder 15 as video coding control information.

In step S6, on the basis of the video coding control information supplied from the coding controller 18, the encoder 15 re-encodes the video data of unit time (in this example, unit time A) to be processed now. In the example shown in FIG. 16, the actual bit rate R_(OA) is smaller than the maximum value D in unit time A; however, since the maximum value D is fixed, the video allocated bit rate becomes (S−D). A wasted portion R_(sa) (R_(sa)=D−R_(OA)) which cannot be used for video coding occurs because the maximum value D is fixed. The wasted portion is filled with stuffing bits.

In step S7, the coding controller 18 determines whether there remains any stream to be re-encoded. If any streams remain to be re-encoded, the procedure returns to step S4 to repeat the above-mentioned processes.

If, in step S7, no more streams remain to be re-encoded, this processing comes to an end.

Thus, in the example shown in FIG. 16, in unit time B, the bit rate of non-video streams also is D and the video stream allocated bit rate is S−D because it is fixed. Stuffing bits are inserted in value R_(sb) (R_(sb)=S−(S−D)−R_(OB)=D−R_(OB)).

In unit time C, too, the bit rate of non-video streams is D and the video stream allocated bit rate is S−D. It should be noted that, in unit time C, D=R_(OC), so that no stuffing bits exist.

Thus, in the example shown in FIG. 16, the video stream is coded with a fixed bit rate.
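The fixed allocation of FIG. 17 can be illustrated by the following minimal sketch. It is not part of the disclosed apparatus: the function name, the recording rate of 4000 Kbps, and the per-unit-time non-video rates are assumptions chosen only for illustration; only the 384 Kbps maximum per audio stream is taken from the description above.

```python
# Illustrative sketch of the fixed-allocation processing of FIG. 17.
# Names and numeric values are assumptions, except the 384 Kbps
# per-audio-stream maximum quoted in the description.

def fixed_allocation(recording_rate, max_non_video_rate, non_video_rates):
    """Return a list of (video_rate, stuffing_rate) pairs, one per unit time."""
    # Step S3: C = S - D is reserved for video in every unit time.
    video_rate = recording_rate - max_non_video_rate
    allocation = []
    for actual in non_video_rates:
        if actual > max_non_video_rate:
            raise ValueError("actual non-video rate exceeds the maximum value D")
        # The unused part of D (D - R_O) is filled with stuffing bits.
        allocation.append((video_rate, max_non_video_rate - actual))
    return allocation


S = 4000                      # assumed recording rate in Kbps
D = 384 * 2                   # two audio streams at the 384 Kbps maximum
non_video = [500, 640, 768]   # assumed R_OA, R_OB, R_OC in Kbps

allocation = fixed_allocation(S, D, non_video)
for unit, (video, stuffing) in zip("ABC", allocation):
    print(f"unit time {unit}: video={video} Kbps, stuffing={stuffing} Kbps")
```

With these assumed values the video rate is the same in every unit time, and stuffing disappears only in the unit time where the non-video rate reaches D, as in unit time C of FIG. 16.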

FIG. 18 is a flowchart describing a processing example in which the video re-encoding allocated bit rate is variable. First, in step S21, the coding controller 18 sets recording rate S on the basis of the information supplied via the terminal 19. Next, in step S22, the coding controller 18 analyzes the coding information of the video stream on the basis of the video stream information supplied from the decoder 14. The processes of steps S21 and S22 are the same as those of steps S1 and S4 of FIG. 17.

In step S23, the coding controller 18 computes, from the output of the analyzing block 13, the total bit rate B in each unit time of non-video streams.

In step S24, the coding controller 18 uses, as the video re-encoding allocated bit rate, value C (C=S−B) obtained by subtracting value B obtained in step S23 from value S obtained in step S21.

In step S25, the coding controller 18 determines, on the basis of value C obtained in step S24 and a result of analysis of the video stream coding information obtained in step S22, video coding parameters such that an optimum picture quality is obtained. The determined coding parameters are output to the encoder 15.

In step S26, the encoder 15 re-encodes the video data of the current unit time on the basis of the coding parameters determined in step S25. Consequently, as shown in FIG. 19, for example, after allocation of R_(oa) (=R_(OA)) as the bit rate in unit time of non-video streams, the bit rate of the video stream is set to bit rate R_(va) specified by (S−R_(oa)).

In step S27, the coding controller 18 determines whether any streams remain to be processed. If any streams remain to be processed, the procedure returns to step S22 to repeat the above-mentioned processes. If no more streams remain to be processed, this processing comes to an end.

Thus, in unit time B, after allocation of bit rate R_(ob) (=R_(OB)) of non-video streams, the remaining R_(vb) (=S−R_(ob)) is the bit rate of the video stream. In unit time C, after allocation of bit rate R_(oc) (=R_(OC)) of non-video streams, the bit rate of the video stream is set to R_(vc) (=S−R_(oc)).

Thus, in the present processing example, the bit rate of the video stream is variable and, therefore, no stuffing bit is needed or the number of stuffing bits can be reduced, thereby coding the video stream more efficiently.
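The variable allocation of FIG. 18 can be sketched in the same way. The function name and the numeric values below are assumptions for illustration only; the sketch simply applies C=S−B in each unit time.

```python
# Illustrative sketch of the variable-allocation processing of FIG. 18.
# Names and numeric values are assumptions made for this example.

def variable_allocation(recording_rate, non_video_rates):
    """Return the video bit rate C = S - B for each unit time."""
    video_rates = []
    for total_non_video in non_video_rates:
        if total_non_video > recording_rate:
            raise ValueError("non-video rate exceeds the recording rate S")
        # Steps S23 and S24: the video stream receives whatever the
        # non-video streams do not use in this unit time.
        video_rates.append(recording_rate - total_non_video)
    return video_rates


S = 4000                      # assumed recording rate in Kbps
non_video = [500, 640, 768]   # assumed R_OA, R_OB, R_OC in Kbps

video_rates = variable_allocation(S, non_video)
print(video_rates)
```

Under these assumptions every unit time sums exactly to S, so no bandwidth is left over for stuffing bits.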

In the above, the input transport stream has a fixed bit rate. The present invention also is applicable to an example in which the bit rate of the input transport stream is variable as shown in FIG. 20.

Consequently, a transport stream of longer content can be recorded to the recording medium 22 at a lower bit rate as required.

In addition, the above-mentioned novel embodiment prevents the qualities of audio data, still picture and character graphics data, multimedia coding data, and other non-video data from being conspicuously deteriorated. The non-video data is basically smaller in data volume than video data, so that reducing the bit rate of the non-video data in the same ratio as the bit rate of video data makes the effects on the non-video data relatively greater than those on video data. The novel embodiment can prevent these effects from being caused.

The following describes the reproduction of a source packet stream file recorded on the recording medium 22. Referring to FIG. 21, there is shown a block diagram illustrating the configuration of a reproducing apparatus practiced as one embodiment of the invention. A source packet stream file recorded on the recording medium 22 is read by a reading block 31. The reading block 31 also reads multimedia display sub-information recorded on the recording medium 22 as a file separate from the source packet stream file.

The source packet stream read by the reading block 31 is output to an arrival timestamp separating block 32 and the multimedia display sub-information is output to a synthesizing block 36. The arrival timestamp separating block 32 incorporates a reference clock. The arrival timestamp separating block 32 compares the reference clock with the value of the arrival timestamp added to each source packet of the input source packet stream and, when a match is found, removes the arrival timestamp from the source packet having the matching arrival timestamp, outputting the resultant packet to a demultiplexer 33 as a transport stream packet.
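The behavior of the arrival timestamp separating block 32 may be pictured with the following minimal sketch. The class and function names are hypothetical, the reference clock is modeled as a simple sequence of integer ticks, and the timestamps are assumed to be non-decreasing and never earlier than the first tick; timestamp wrap-around and the actual source packet byte layout are not modeled.

```python
# Illustrative model of the arrival timestamp separating block 32.
# SourcePacket and release_packets are hypothetical names.
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class SourcePacket:
    arrival_timestamp: int   # value added at recording time
    ts_packet: bytes         # the 188-byte transport packet


def release_packets(
    source_packets: Iterable[SourcePacket],
    clock_ticks: Iterable[int],
) -> Iterator[Tuple[int, bytes]]:
    """Yield (tick, transport packet) whenever the clock matches a timestamp."""
    pending = iter(source_packets)
    current = next(pending, None)
    for tick in clock_ticks:
        # Release every packet whose arrival timestamp equals the clock;
        # the timestamp itself is dropped and only the packet is output.
        while current is not None and current.arrival_timestamp == tick:
            yield tick, current.ts_packet
            current = next(pending, None)


packets = [
    SourcePacket(2, b"\x47" + bytes(187)),
    SourcePacket(5, b"\x47" + bytes(187)),
]
released = list(release_packets(packets, range(8)))
print([tick for tick, _ in released])
```

The packets leave the block at the same relative timing with which they arrived at recording time, which is the purpose of the arrival timestamp.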

The demultiplexer 33 separates the input transport stream into a video/audio stream and data streams such as multimedia coding data, character graphics, text, and still picture. Of these separated data, the video/audio stream is output to an AV decoder 34, the multimedia coding data is output to the synthesizing block 36, and the data stream such as character graphics, text, and still picture is output to a character graphics/still picture decoder 35.

The AV decoder 34 separates the input video/audio stream into video data and audio data, decodes each data, and outputs the decoded audio data to an audio reproducing device, not shown, and the decoded video data to the synthesizing block 36. The character graphics/still picture decoder 35 decodes the input data stream, such as character graphics, text, and still picture, and outputs the decoded character graphics data, text data, and still picture data to the synthesizing block 36.

In the synthesizing block 36, the video data from the AV decoder 34, the multimedia coding data from the demultiplexer 33, the data from the character graphics/still picture decoder 35, and the multimedia display sub-information from the reading block 31 are input. Checking the mismatch flag (FIG. 6) of the input multimedia display sub-information, the synthesizing block 36 determines whether a mismatch exists in the relationship between the input video signal and the multimedia coding data.

If a mismatch exists between the value of video format and the value of original video format shown in FIG. 8 and/or a mismatch exists between the value of display aspect ratio and the value of original display aspect ratio, the synthesizing block 36 determines that a video format change has been caused by the video re-encoding at the time of recording, detecting a mismatch in the relationship between the input video signal and the multimedia coding data. If no mismatch exists between the value of video format and the value of original video format and no mismatch exists between the value of display aspect ratio and the value of original display aspect ratio, the synthesizing block 36 determines that no mismatch exists in the relationship between the input video signal and the multimedia coding data.
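The comparison described above amounts to a logical OR of two inequality tests. The following sketch shows it with hypothetical field names and illustrative string values; the actual coding of the video format and display aspect ratio fields of FIG. 8 is not reproduced here.

```python
# Illustrative sketch of the format comparison performed by the
# synthesizing block 36. Field names and values are assumptions.
from typing import NamedTuple


class DisplayFormat(NamedTuple):
    video_format: str
    display_aspect_ratio: str


def format_changed(current: DisplayFormat, original: DisplayFormat) -> bool:
    """True if either the video format or the display aspect ratio differs."""
    return (
        current.video_format != original.video_format
        or current.display_aspect_ratio != original.display_aspect_ratio
    )


original = DisplayFormat("480i", "16:9")
same = format_changed(DisplayFormat("480i", "16:9"), original)
different = format_changed(DisplayFormat("480i", "4:3"), original)
print(same, different)
```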

If a mismatch is found in the relationship between the input video signal and the multimedia coding data, the synthesizing block 36 further references the original horizontal size and vertical size of the multimedia display sub-information or references the original video format and the original display aspect ratio. Then, the synthesizing block 36 scale-converts the input video signal so that it can be displayed in a frame of the referenced size. On the basis of the multimedia coding data, the synthesizing block 36 synthesizes the scale-converted video signal with the other data, such as character graphics, on a multimedia plane and outputs the synthesized signal to a television receiver, not shown, which serves as a display device.

On the other hand, if no mismatch is found in the relationship between the input video signal and the multimedia coding data, the synthesizing block 36 synthesizes the input video signal with other data on a multimedia plane without scale conversion and outputs the synthesized data.
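The two branches described above can be sketched as follows. The structure and field names are hypothetical stand-ins for the multimedia display sub-information, and only the computation of the target frame size and scale factors is shown, not the resampling of pixels itself.

```python
# Illustrative sketch of the scale-conversion decision made by the
# synthesizing block 36. SubInfo and its fields are hypothetical names.
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple


@dataclass
class SubInfo:
    mismatch_flag: bool
    original_horizontal_size: int
    original_vertical_size: int


def target_frame_size(decoded_size: Tuple[int, int], info: SubInfo) -> Tuple[int, int]:
    """Return the frame size in which the decoded video is to be displayed."""
    if info.mismatch_flag:
        # Mismatch: restore the frame size the multimedia coding data expects.
        return (info.original_horizontal_size, info.original_vertical_size)
    # No mismatch: the decoded video is synthesized without scale conversion.
    return decoded_size


def scale_factors(decoded_size: Tuple[int, int], info: SubInfo) -> Tuple[Fraction, Fraction]:
    """Return exact (horizontal, vertical) scale factors."""
    target_w, target_h = target_frame_size(decoded_size, info)
    width, height = decoded_size
    return (Fraction(target_w, width), Fraction(target_h, height))


# A 720x480 frame re-encoded to 360x480 at recording time.
info = SubInfo(mismatch_flag=True, original_horizontal_size=720, original_vertical_size=480)
factors = scale_factors((360, 480), info)
print(factors)
```

For the 720×480 to 360×480 example given earlier, the sketch yields a horizontal factor of 2 and a vertical factor of 1.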

Thus, recording the multimedia display sub-information and using it at the time of reproduction allow the receiving side to display a screen as intended on the sending side. Referring to FIG. 22, if the re-encoding on the sending side (recording side) results in a smaller video picture frame than the original, the size reduction is recorded as multimedia display sub-information, which is referenced at the time of reproduction. Consequently, because there exists no mismatch between video data and other data, the receiving side (the reproduction side) can display the same screen as the original.

FIG. 24 is a flowchart describing AV stream reproduction processing which uses multimedia display sub-information.

In step 60, a multiplexed stream including multimedia coding data is read from a recording medium and input in a reproduction device.

In step 61, multimedia display sub-information is input. This information is read from the recording medium in the case of the reproducing device shown in FIG. 21; in the case of a reproducing device shown in FIG. 25, this information is separated from the multiplexed stream.

In step 62, a video stream is separated from the multiplexed stream.

In step 63, the video stream is decoded.

In step 64, if a mismatch exists between the video data and the multimedia coding data, the synthesizing block 36 scale-converts the video data on the basis of the multimedia display sub-information.

In step 65, the synthesizing block 36 synthesizes the processed image and the multimedia data to generate a display image.

As described, the multimedia display sub-information may be recorded on the recording medium 22 as a file which is different from the source packet stream file containing character graphics data and video signals. Alternatively, the multimedia display sub-information may be embedded in a source packet stream file and then recorded on the recording medium 22. FIG. 23 shows the configuration of the recording apparatus 1 in which the multimedia display sub-information is embedded in a source packet stream file.

In comparison with the configuration shown in FIG. 3, the recording apparatus 1 shown in FIG. 23 supplies the multimedia display sub-information output from the coding block 20 to the multiplexer 16. The multiplexer 16 then generates a transport packet of the input multimedia display sub-information and embeds it into a source packet stream file, outputting the same to the arrival timestamp adding block 17. Instead of embedding the multimedia display sub-information into a source packet stream file as a transport packet, the multimedia display sub-information may be written to a user data area in an MPEG video stream.

In the present embodiment of the invention, video data may be re-encoded using other methods than that described above; for example, an input video stream may be converted in the DCT area to convert the coding parameters such as picture frame.

FIG. 25 shows the configuration of the reproducing apparatus 30 in which the multimedia display sub-information is embedded in a source packet stream file to be recorded on the recording medium 22. In comparison between the configuration of the reproducing apparatus shown in FIG. 25 and the configuration shown in FIG. 21, the former reads only the source packet stream through the reading block 31. The source packet stream read by the reading block 31 is input to the demultiplexer 33 via the arrival timestamp separating block 32.

The demultiplexer 33 extracts the multimedia display sub-information from the input source packet stream file and outputs the extracted information to the synthesizing block 36. The further processing is the same as that of the configuration shown in FIG. 5.

Thus, if the multimedia display sub-information is recorded as embedded in a source packet stream file, the receiving side can also obtain the video picture size and display position intended by the sending side.

In the present embodiment of the invention, a transport stream was used as an example. The present invention also is applicable to multiplexed streams such as a program stream.

The above-described sequence of processing operations can be executed by hardware as well as software. In the software approach, the recording apparatus 1 (and the reproducing apparatus 30) is constituted by a personal computer as shown in FIG. 26.

Referring to FIG. 26, a CPU (Central Processing Unit) 101 executes various processing operations as instructed by programs stored in a ROM (Read Only Memory) 102 or loaded from a storage block 108 into a RAM (Random Access Memory) 103. The RAM 103 also stores, as required, the data necessary for the CPU 101 to execute various processing operations.

The CPU 101, the ROM 102, and the RAM 103 are interconnected via a bus 104. The bus 104 also is connected to an input/output interface 105.

The input/output interface 105 is connected to an input block 106, such as a keyboard and a mouse, an output block 107, such as a display device (for example, a CRT or LCD) and a speaker, a storage block 108, such as a hard disk, and a communication block 109, such as a modem or terminal adapter. The communication block 109 executes communication processing via a network.

The input/output interface 105 also is connected to a drive 110, as required, in which a magnetic disk 121, an optical disc 122, a magneto-optical disk 123, or a semiconductor memory 124 is loaded. Computer programs read from these storage media are installed in the storage block 108 as required.

The execution of a sequence of processing operations by software requires the use of a computer having a dedicated hardware device storing beforehand the programs constituting the software or a general-purpose computer in which these programs are installed, as required, from a recording medium.

The program recording medium for storing computer-readable and executable programs may be, as shown in FIG. 26, a package medium which is distributed to users to provide the programs and which is embodied by the magnetic disk 121 (including a floppy disk), the optical disc 122 (including CD-ROM (Compact Disc-Read Only Memory) and DVD (Digital Versatile Disc)), the magneto-optical disk 123 (including MD (Mini Disc)), or the semiconductor memory 124. Alternatively, it may be the ROM 102 or a hard disk which is preinstalled in a personal computer provided to users and on which the programs are stored temporarily or permanently.

It should be noted that the steps describing the programs to be stored on the program storage medium may be executed not only in time series in the order described, but also in parallel or individually.

As described, and according to the first image coding apparatus and method and the program stored in the first recording medium, a video stream is separated from a multiplexed stream containing multimedia coding data, a predetermined conversion process is performed on the separated video stream, and additional information is generated which indicates that a mismatch will occur when the converted video stream is displayed on the basis of the multimedia coding data.

The first recording medium stores the converted video stream, the multimedia coding data, and the additional information indicative that a mismatch will occur when displaying the converted video stream on the basis of the above-mentioned multimedia coding data.

Consequently, in any case, the reproducing side can prevent a mismatch from occurring between the video stream and the multimedia coding data.

As described and according to the image decoding apparatus and method and the program stored in the second recording medium, a video stream is separated from an input multiplexed stream and the separated video stream is decoded. On the basis of additional information indicating that a mismatch will occur when the decoded video stream is displayed on the basis of multimedia coding data, a predetermined conversion process is performed on the decoded video stream. This novel configuration prevents the mismatch from occurring between the video stream and the multimedia coding data.

As described and according to the second image coding apparatus and method and the program stored in the third recording medium, a video stream is separated from an input multiplexed stream, the input multiplexed stream is checked for multimedia coding data and, if the multimedia coding data is found, coding control information for giving an instruction not to change the display format of the separated video stream is generated, and a predetermined conversion process is performed on the separated video stream on the basis of the generated coding control information.

The second recording medium also stores the above-mentioned coding control information giving instruction not to change the display format of a video stream and a multiplexed stream containing the video stream on which a predetermined conversion process has been performed on the basis of the coding control information.

Consequently, in any case, the reproduction side can prevent a mismatch from occurring between the video stream and the multimedia coding data.

Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. 

1. An image coding apparatus, comprising: a selector operable to receive a multiplexed transport stream that includes multimedia coding data; a demultiplexer operable to separate a video stream from the multiplexed transport stream; a decoder operable to reproduce the separated video stream as decoded video data; a coding generator operable to receive multimedia information associated with the multimedia coding data and to generate display control information, the display control information including a mismatch flag which indicates whether a display mismatch condition exists between the video data and multimedia coding data; and an output unit operable to output the decoded video data, the multimedia coding data and the mismatch flag.
 2. The image coding apparatus of claim 1, further comprising an encoder which is coupled to said decoder and which is operable to reproduce the video stream based on the multimedia information associated with the multimedia coding data and the video data.
 3. The image coding apparatus of claim 1, wherein said output unit comprises a writing unit operable to record the decoded video data, the multimedia coding data and the mismatch flag onto a recording medium.
 4. The image coding apparatus of claim 1, further comprising a coding controller which is coupled between said selector and said coding generator and which is operable to generate the multimedia information associated with the multimedia coding data.
 5. The image coding apparatus of claim 1, further comprising a data analyzer which is coupled between said selector and said coding controller and which is operable to detect at least a bit rate associated with the video stream.
 6. The image coding apparatus of claim 1, wherein the display control information includes a re-encode flag which indicates whether the video data is re-encoded.
 7. The image coding apparatus of claim 1, wherein the display control information includes a frame size change flag which indicates whether a size of a picture frame associated with the video data has been changed.
 8. An image coding method, comprising: receiving a multiplexed transport stream that includes multimedia coding data; separating a video stream from the multiplexed transport stream; reproducing the separated video stream as decoded video data; receiving multimedia information associated with the multimedia coding data; generating display control information that includes a mismatch flag which indicates whether a display mismatch condition exists between the video data and multimedia coding data; and outputting the decoded video data, the multimedia coding data and the mismatch flag.
 9. The image coding method of claim 8, further comprising: reproducing the video stream based on the multimedia information associated with the multimedia coding data and the video data.
 10. The image coding method of claim 8, wherein said outputting step includes writing the decoded video data, the multimedia coding data and the mismatch flag onto a recording medium.
 11. The image coding method of claim 8, further comprising: generating the multimedia information associated with the multimedia coding data.
 12. The image coding method of claim 8, further comprising: detecting at least a bit rate associated with the video stream.
 13. The image coding method of claim 8, wherein the display control information includes a re-encode flag which indicates whether the video data is re-encoded.
 14. The image coding method of claim 8, wherein the display control information includes a frame size change flag which indicates whether a size of a picture frame associated with the video data has been changed.
 15. A computer-readable medium having recorded instructions for carrying out an image coding method, said method comprising: receiving a multiplexed transport stream that includes multimedia coding data; separating a video stream from the multiplexed transport stream; reproducing the separated video stream as decoded video data; receiving multimedia information associated with the multimedia coding data; generating display control information that includes a mismatch flag which indicates whether a display mismatch condition exists between the video data and multimedia coding data; and outputting the decoded video data, the multimedia coding data and the mismatch flag.
 16. The computer-readable medium of claim 15, wherein said image coding method further comprises: reproducing the video stream based on the multimedia information associated with the multimedia coding data and the video data.
 17. The computer-readable medium of claim 15, wherein said outputting step includes writing the decoded video data, the multimedia coding data and the mismatch flag onto a recording medium.
 18. The computer-readable medium of claim 15, wherein said image coding method further comprises: generating the multimedia information associated with the multimedia coding data.
 19. The computer-readable medium of claim 15, wherein said image coding method further comprises: detecting at least a bit rate associated with the video stream.
 20. The computer-readable medium of claim 15, wherein the display control information includes a re-encode flag which indicates whether the video data is re-encoded.
 21. The computer-readable medium of claim 15, wherein the display control information includes a frame size change flag which indicates whether a size of a picture frame associated with the video data has been changed. 